IEICE global.ieice.org Site

Author Search Result

[Author] Takao ONOYE(65hit)

21-40hit(65hit)

Implementation of Viterbi Decoder toward GPU-Based SDR Receiver
Kosuke TOMITA Masahide HATANAKA Takao ONOYE

PAPER

Vol: E98-A No:11
Page(s): 2246-2253
Viterbi decoding is commonly used in several protocols, but its computational cost is quite high, so it must be implemented effectively. This paper describes a GPU implementation of a Viterbi decoder that uses the three-point Viterbi decoding algorithm (TVDA), in which the received bits are divided into multiple chunks and several chunks are decoded simultaneously. Coalesced access and Warp Shuffle, a newly introduced instruction, are also used to improve decoder performance. In addition, iterative execution of parallel chunk decoding reduces the latency of the proposed Viterbi decoder so that it can serve as part of a GPU-based SDR transceiver. As a result, the throughput of the proposed Viterbi decoder is improved by 23.1%.
VLSI Architecture of Switching Control for AAL Type2 Switch
Masahide HATANAKA Toshihiro MASAKI Takao ONOYE Koso MURAKAMI

PAPER

Vol: E83-A No:3
Page(s): 435-441
This paper presents the switching control and VLSI architecture for the AAL2 switch. The ATM network with the AAL2 switch can efficiently transmit low-bit-rate data, even if the network has many endpoints. The switch is capable of not only switching AAL2 cells but also converting the header of other types of ATMs. The AAL2 switch is integrated into a single chip. The proposed ATM network is constructed by AAL2 switches attached to the ATM switches.
Thermal-Comfort Aware Online Co-Scheduling Framework for HVAC, Battery Systems, and Appliances in Smart Buildings
Daichi WATARI Ittetsu TANIGUCHI Francky CATTHOOR Charalampos MARANTOS Kostas SIOZIOS Elham SHIRAZI Dimitrios SOUDRIS Takao ONOYE

INVITED PAPER

Publicized: 2022/10/24
Vol: E106-A No:5
Page(s): 698-706
Energy management in buildings is vital for reducing electricity costs and maximizing the comfort of occupants. Excess solar generation can be used by combining a battery storage system and a heating, ventilation, and air-conditioning (HVAC) system so that occupants feel comfortable. Despite several studies on the scheduling of appliances, batteries, and HVAC, comprehensive and time-scalable approaches are required that integrate such predictive information as renewable generation and thermal comfort. In this paper, we propose a thermal-comfort aware online co-scheduling framework that incorporates optimal energy scheduling and a prediction model of PV generation and thermal comfort with the model predictive control (MPC) approach. We introduce a photovoltaic (PV) energy nowcasting and thermal-comfort-estimation model that provides useful information for optimization. The energy management problem is formulated as three coordinated optimization problems that cover fast and slow time-scales by considering predicted information. This approach reduces the time complexity without a significant negative impact on the result's global nature and its quality. Experimental results show that our proposed framework achieves optimal energy management that takes into account the trade-off between electricity expenses and thermal comfort. Our sensitivity analysis indicates that introducing a battery significantly improves the trade-off relationship.
An In-Vehicle Auditory Signal Evaluation Platform based on a Driving Simulator
Fuma SAWA Yoshinori KAMIZONO Wataru KOBAYASHI Ittetsu TANIGUCHI Hiroki NISHIKAWA Takao ONOYE

PAPER-Acoustics

Publicized: 2023/05/22
Vol: E106-A No:11
Page(s): 1368-1375
Advanced driver-assistance systems (ADAS) generally play an important role in supporting safe driving by detecting potential risk factors beforehand and informing the driver of them. However, if too many ADAS services rely on visual-based technologies, the driver becomes increasingly burdened and exhausted, especially in the eyes. Drivers should be relieved of monitoring tasks other than the most important ones in order to alleviate their burden as much as possible. In recent years, in-vehicle auditory signals that assist safe driving have attracted attention as an alternative to visual cues. In this paper, we developed an in-vehicle auditory signal evaluation platform in an existing driving simulator. In addition, using in-vehicle auditory signals, we demonstrate with the developed platform the possibility of partially switching from purely visual-based tasks to a mix of visual-based and auditory-based ones to alleviate the burden on drivers.
High-Level Synthesis of a Multithreaded Processor for Image Generation
Takao ONOYE Toshihiro MASAKI Isao SHIRAKAWA Hiroaki HIRATA Kozo KIMURA Shigeo ASAHARA Takayuki SAGISHIMA

PAPER-VLSI Design Technology and CAD

Vol: E78-A No:3
Page(s): 322-330
This paper describes the design procedure of a multithreaded processor dedicated to image generation, carried out by means of the high-level synthesis tool PARTHENON. The processor employs a multithreaded architecture, which is a novel and promising approach to parallel image generation. This paper puts special stress on the high-level synthesis scheme, which simplifies the behavioral description of the structure and control of complex hardware and therefore enables the design of a complicated mechanism for a multithreaded processor. Implementation results of the synthesis are also shown to demonstrate the performance of the designed processor. This processor greatly improves the image generation throughput so far attained by the conventional approach.
Field Slack Assessment for Predictive Fault Avoidance on Coarse-Grained Reconfigurable Devices
Toshihiro KAMEDA Hiroaki KONOURA Dawood ALNAJJAR Yukio MITSUYAMA Masanori HASHIMOTO Takao ONOYE

PAPER-Test and Verification

Vol: E96-D No:8
Page(s): 1624-1631
This paper proposes a procedure for avoiding delay faults in the field by assessing slack during standby time. The proposed procedure performs path delay testing and checks whether the slack is larger than a threshold value using a selectable delay embedded in basic elements (BEs). If the slack is smaller than the threshold, a pair of BEs to be replaced, which maximizes the path slack, is identified. Experimental results with two application circuits mapped on a coarse-grained architecture show that, for aging-induced delay degradation, a small threshold slack, less than 1 ps in a test case, is enough to ensure delay fault prediction.
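The check-then-replace flow in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the picosecond figures, and the idea of representing replacement candidates as a dictionary of resulting path delays are all assumptions made for illustration. The sketch treats the delay test as passing when the path still meets the clock period with the threshold delay inserted, that is, when the slack is at least the threshold.

```python
from typing import Dict, Optional


def passes_delay_test(path_delay_ps: float, inserted_delay_ps: float,
                      clock_period_ps: float) -> bool:
    """Path delay test with extra delay inserted (hypothetical model).

    Passing means the path still fits in the clock period, so the
    slack is at least as large as the inserted delay.
    """
    return path_delay_ps + inserted_delay_ps <= clock_period_ps


def assess_slack(path_delay_ps: float, threshold_ps: float,
                 clock_period_ps: float,
                 candidates: Dict[str, float]) -> Optional[str]:
    """Return None if the slack is sufficient, else the best replacement.

    `candidates` maps a hypothetical replacement label to the path delay
    that would result after that replacement.
    """
    if passes_delay_test(path_delay_ps, threshold_ps, clock_period_ps):
        return None  # enough slack, nothing to replace
    # Pick the replacement that maximizes the remaining slack.
    return max(candidates, key=lambda name: clock_period_ps - candidates[name])


if __name__ == "__main__":
    options = {"swap_a": 980.0, "swap_b": 970.0}
    print(assess_slack(990.0, 5.0, 1000.0, options))  # slack is sufficient
    print(assess_slack(998.0, 5.0, 1000.0, options))  # replacement needed
```

In this toy model the threshold plays the role of the selectable delay: a path that fails with the delay inserted is flagged before it fails in normal operation.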
An Embedded Zerotree Wavelet Video Coding Algorithm with Reduced Memory Bandwidth
Roberto Y. OMAKI Gen FUJITA Takao ONOYE Isao SHIRAKAWA

PAPER-Image

Vol: E85-A No:3
Page(s): 703-713
A wavelet-based algorithm for scalable video compression is described, with the main focus on memory bandwidth reduction and efficient VLSI implementation. The proposed algorithm adopts a modified 2-D subband decomposition scheme in conjunction with a partial zerotree search for efficient Embedded Zerotree Wavelet coding. An experimental comparison with conventional DWT, MPEG-2, and JPEG demonstrates that the image quality of the proposed algorithm is consistently superior to that of JPEG, and that our scheme can even outperform MPEG-2 in some cases, although it does not exploit inter-frame redundancy. Although its performance is inferior to that of the conventional DWT, the proposed algorithm attains a significant reduction of DWT memory requirements, achieving a reasonable balance between implementation cost and image quality.
Efficient 3-D Sound Movement with Time-Varying IIR Filters
Kosuke TSUJINO Wataru KOBAYASHI Takao ONOYE Yukihiro NAKAMURA

PAPER-Speech/Audio Processing

Vol: E90-A No:3
Page(s): 618-625
3-D sound using head-related transfer functions (HRTFs) is applicable to embedded systems such as portable devices, since it can create spatial sound effect without multichannel transducers. Low-order modeling of HRTF with an IIR filter is effective for the reduction of the computational load required in embedded applications. Although modeling of HRTFs with IIR filters has been studied earnestly, little attention has been paid to sound movement with IIR filters, which is important for practical applications of 3-D sound. In this paper, a practical method for sound movement is proposed, which utilizes time-varying IIR filters and variable delay filters. The computational cost for sound movement is reduced by about 50% with the proposed method, compared to conventional low-order FIR implementation. In order to facilitate efficient implementation of 3-D sound movement, tradeoffs between the subjective quality of the output sound and implementation parameters such as the size of filter coefficient database and the update period of filter coefficients are also discussed.
Hardware Architecture of the Fast Mode Decision Algorithm for H.265/HEVC
Wenjun ZHAO Takao ONOYE Tian SONG

PAPER-VLSI Design Technology and CAD

Vol: E98-A No:8
Page(s): 1787-1795
In this paper, a specific hardware architecture for the Fast Mode Decision (FMD) algorithms presented in our previous work is proposed. This architecture is designed as an embedded mode dispatch module. On the basis of this module, some unnecessary modes can be skipped or the mode decision process can be terminated in advance. In order to maintain higher compatibility, the FMD algorithms are designed together as a single module that can be easily embedded into a common video codec for H.265/HEVC. The input and output interfaces between the proposed module and other parts of the codec are designed based on a simple but effective protocol. Hardware synthesis results on FPGA demonstrate that the proposed architecture achieves a maximum frequency of about 193 MHz with less than 1% of the total resources consumed. Moreover, the proposed module can improve the overall throughput.
Jitter Amplifier for Oscillator-Based True Random Number Generator
Takehiko AMAKI Masanori HASHIMOTO Takao ONOYE

PAPER-Cryptography and Information Security

Vol: E96-A No:3
Page(s): 684-696
We propose a jitter amplifier architecture for an oscillator-based true random number generator (TRNG). Two types of latency-controllable (LC) buffer, which are the key components of the proposed jitter amplifier, are presented. We derive an equation to estimate the gain of the jitter amplifier, and analyze sufficient conditions for the proposed circuit to work properly. The proposed jitter amplifier was fabricated with a 65 nm CMOS process. The jitter amplifier with the two-voltage LC buffer occupied 3,300 µm² and attained 8.4x gain, and that with the single-voltage LC buffer achieved 2.2x gain with a 1,700 µm² area. The jitter amplification of the sampling clock increased the entropy of a bit stream and improved the results of the NIST test suite so that all the tests passed, whereas TRNGs with simple correctors failed. The jitter amplifier attained higher throughput per area than a frequency divider when the required amount of jitter was more than two times larger than the inherent jitter in our test-chip implementations.
Reliability-Configurable Mixed-Grained Reconfigurable Array Supporting C-Based Design and Its Irradiation Testing
Hiroaki KONOURA Dawood ALNAJJAR Yukio MITSUYAMA Hajime SHIMADA Kazutoshi KOBAYASHI Hiroyuki KANBARA Hiroyuki OCHI Takashi IMAGAWA Kazutoshi WAKABAYASHI Masanori HASHIMOTO Takao ONOYE Hidetoshi ONODERA

PAPER-High-Level Synthesis and System-Level Design

Vol: E97-A No:12
Page(s): 2518-2529
This paper proposes a mixed-grained reconfigurable architecture consisting of fine-grained and coarse-grained fabrics, each of which can be configured for different levels of reliability depending on the reliability requirement of target applications, e.g. from mission-critical applications to consumer products. Thanks to the fine-grained fabrics, the architecture can accommodate a state machine, which is indispensable for exploiting C-based behavioral synthesis to trade latency for resource usage through multi-step processing using dynamic reconfiguration. In implementing the architecture, the strategy of dynamic reconfiguration, the assignment of configuration storage, and the number of implementable states are key factors that determine the achievable trade-off between used silicon area and latency. We thus split the configuration bits into two classes, state-wise configuration bits and state-invariant configuration bits, to minimize the area overhead of configuration bit storage. Through a case study, we experimentally explore the appropriate number of implementable states. A proof-of-concept VLSI chip was fabricated in a 65 nm process. Measurement results show that applications on the chip can operate in a harsh radiation environment. Irradiation tests also show the correlation between the number of sensitive bits and the mean time to failure. Furthermore, the temporal error rate of an example application due to soft errors in the datapath was measured and demonstrated for reliability-aware mapping.
Low-Power Scheme of NMOS 4-Phase Dynamic Logic
Bao-Yu SONG Makoto FURUIE Yukihiro YOSHIDA Takao ONOYE Isao SHIRAKAWA

LETTER-Low-Power Circuit Technique

Vol: E82-C No:9
Page(s): 1772-1776
An NMOS 4-phase dynamic logic scheme is described, which is intended to achieve low power consumption in deep submicron design. In this scheme, the short-circuit current is eliminated and, moreover, the voltage swing of transition signals is reduced, which effectively enhances power reduction. First, distinctive features of this 4-phase dynamic logic are specified in comparison with static CMOS logic and dynamic domino CMOS logic. Then, power simulations are carried out for the 4-phase dynamic logic, static CMOS logic, dynamic CMOS logic, and pass-transistor logic using a number of logic modules, which demonstrate that the NMOS 4-phase dynamic logic is the most power-efficient. Moreover, through gate delay simulation, how many transistors can be packed in a logic block is also discussed.
Signal-Dependent Analog-to-Digital Conversion Based on MINIMAX Sampling
Igors HOMJAKOVS Masanori HASHIMOTO Tetsuya HIROSE Takao ONOYE

PAPER

Vol: E96-A No:2
Page(s): 459-468
This paper presents an architecture for a signal-dependent analog-to-digital converter (ADC) based on the MINIMAX sampling scheme, which achieves a high data compression rate and power reduction. The proposed architecture consists of a conventional synchronous ADC, a timer and a peak detector. AD conversion is carried out only when input signal peaks are detected. To improve the accuracy of signal reconstruction, MINIMAX sampling is improved so that multiple points are captured for each peak, and its effectiveness is experimentally confirmed. In addition, power reduction, which is the primary advantage of the proposed signal-dependent ADC, is analytically discussed and then validated with circuit simulations.
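The idea of converting only at signal peaks can be illustrated in software with a minimal sketch. It is not the circuit described in the paper: it assumes an already discretized signal, treats a sign change of the slope as a peak, stores the sample index in place of the timer value, and reconstructs by linear interpolation. The function names are hypothetical, and flat plateaus are not handled.

```python
from typing import List, Tuple


def minimax_sample(signal: List[float]) -> List[Tuple[int, float]]:
    """Keep only local extrema, plus both endpoints, as (index, value)."""
    if not signal:
        return []
    samples = [(0, signal[0])]
    for i in range(1, len(signal) - 1):
        prev_slope = signal[i] - signal[i - 1]
        next_slope = signal[i + 1] - signal[i]
        # A sign change of the slope marks a local maximum or minimum.
        if prev_slope * next_slope < 0:
            samples.append((i, signal[i]))
    if len(signal) > 1:
        samples.append((len(signal) - 1, signal[-1]))
    return samples


def reconstruct(samples: List[Tuple[int, float]], length: int) -> List[float]:
    """Rebuild the signal by linear interpolation between stored samples."""
    out = [0.0] * length
    if len(samples) == 1:
        out[samples[0][0]] = samples[0][1]
    for (i0, v0), (i1, v1) in zip(samples, samples[1:]):
        for i in range(i0, i1 + 1):
            out[i] = v0 + (v1 - v0) * (i - i0) / (i1 - i0)
    return out


if __name__ == "__main__":
    triangle = [0.0, 1.0, 2.0, 1.0, 0.0, -1.0, 0.0]
    kept = minimax_sample(triangle)
    print(kept)                              # 4 of 7 samples are stored
    print(reconstruct(kept, len(triangle)))  # exact for piecewise-linear input
```

For a piecewise-linear input the reconstruction is exact; for smoother signals the error between peaks is what motivates capturing multiple points per peak, as the abstract notes.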
A Single Tooth Segmentation Using PCA-Stacked Gabor Filter and Active Contour
Pramual CHOORAT Werapon CHIRACHARIT Kosin CHAMNONGTHAI Takao ONOYE

PAPER-Image Processing

Vol: E96-A No:11
Page(s): 2169-2178
In tooth contour extraction, there is insufficient intensity difference in x-ray images between the tooth and dental bone. This difference must be enhanced in order to improve the accuracy of tooth segmentation. This paper proposes a method to enhance the intensity difference between the tooth and dental bone. This method consists of an estimation of tooth orientation (intensity projection, smoothing filter, and peak detection) and PCA-Stacked Gabor with ellipse Gabor banks. Tooth orientation estimation is performed to determine the angle of a single oriented tooth. PCA-Stacked Gabor with ellipse Gabor banks is then used, in particular to enhance the border between the tooth and dental bone. Finally, active contour extraction is performed in order to determine the tooth contour. In the experiment, in comparison with the conventional active contour without edge (ACWE) method, the average mean square error (MSE) values of extracted tooth contour points are reduced from 26.93% and 16.02% to 19.07% and 13.42% for tooth x-ray type I and type H images, respectively.
SOH Aware System-Level Battery Management Methodology for Decentralized Energy Network
Daichi WATARI Ittetsu TANIGUCHI Takao ONOYE

PAPER-VLSI Design Technology and CAD

Vol: E103-A No:3
Page(s): 596-604
The decentralized energy network is one of the promising solutions for a next-generation power grid. In this system, each house has a photovoltaic (PV) panel as a renewable energy source and a battery, which is an essential component for balancing generation and demand. The common objective of battery management in such systems is to minimize only the energy purchased from a power company, but battery degradation caused by charge/discharge cycles is also a serious problem. This paper proposes a State-of-Health (SOH) aware system-level battery management methodology for the decentralized energy network. The power distribution problem is often solved with mixed integer programming (MIP), and the proposed MIP formulation takes into account the SOH model. In order to minimize the purchased energy and reduce the battery degradation simultaneously, the optimization problem is divided into two stages: 1) purchased energy minimization and 2) battery aging factor reduction, which also enables trade-off exploration between the purchased energy and the battery degradation. Experimental results show that the proposed method achieves a better trade-off and reduces the battery aging cost by 14% over the baseline method while keeping the purchased energy at its minimum.
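The two-stage structure in this abstract can be illustrated with a minimal sketch. It does not reproduce the paper's MIP formulation or SOH model: plain enumeration over a handful of made-up candidate schedules stands in for the solver, and the field names `purchased_kwh` and `aging_cost` as well as the `tolerance` parameter are assumptions introduced for illustration.

```python
from typing import Dict, List


def two_stage_select(schedules: List[Dict[str, float]],
                     tolerance: float = 0.0) -> Dict[str, float]:
    """Pick a schedule in two stages (hypothetical enumeration sketch).

    Stage 1 finds the minimum purchased energy. Stage 2 minimizes the
    aging cost among schedules whose purchased energy stays within
    `tolerance` of that minimum.
    """
    min_purchase = min(s["purchased_kwh"] for s in schedules)
    near_optimal = [s for s in schedules
                    if s["purchased_kwh"] <= min_purchase + tolerance]
    return min(near_optimal, key=lambda s: s["aging_cost"])


if __name__ == "__main__":
    candidates = [
        {"purchased_kwh": 10.0, "aging_cost": 5.0},
        {"purchased_kwh": 10.0, "aging_cost": 3.0},
        {"purchased_kwh": 11.0, "aging_cost": 1.0},
    ]
    print(two_stage_select(candidates))                 # purchase kept minimal
    print(two_stage_select(candidates, tolerance=1.0))  # trades energy for aging
```

Raising `tolerance` above zero is one simple way to explore the trade-off between purchased energy and battery degradation; with zero tolerance the purchased energy stays at its minimum.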
Efficient Memory Organization Framework for JPEG2000 Entropy Codec
Hiroki SUGANO Takahiko MASUZAKI Hiroshi TSUTSUI Takao ONOYE Hiroyuki OCHI Yukihiro NAKAMURA

PAPER-Realization

Vol: E92-A No:8
Page(s): 1970-1977
The encoding/decoding process of JPEG2000 requires much more computation power than that of conventional JPEG, mainly due to the complexity of the entropy encoding/decoding. Thus, multiple entropy codec hardware modules are usually implemented in parallel to process the entropy encoding/decoding. This module, however, requires many small memories to store intermediate data, and when multiple modules are implemented on a chip, the large number of SRAMs makes whole-chip layout more difficult. In this paper, an efficient memory organization framework for the entropy encoding/decoding module is proposed, in which not only existing memory organizations but also our proposed novel memory organization methods are applied to expand the design space to be explored. As a result, an efficient memory organization for a target process technology can be explored.
Power Gating Implementation for Supply Noise Mitigation with Body-Tied Triple-Well Structure
Yasumichi TAKAI Masanori HASHIMOTO Takao ONOYE

PAPER-Circuit Design

Vol: E95-A No:12
Page(s): 2220-2225
This paper investigates power gating implementations that mitigate power supply noise. We focus on the body connection of power-gated circuits, and examine the amount of power supply noise induced by power-on rush current and the contribution of a power-gated circuit as a decoupling capacitance during the sleep mode. To determine the best implementation, we designed and fabricated a test chip in a 65 nm process. Experimental results with measurement and simulation reveal that the power-gated circuit with a body-tied structure in triple-well is the best implementation in terms of the following three points: power supply noise due to rush current, the contribution of decoupling capacitance during the sleep mode, and the leakage reduction thanks to power gating.
Real-Time Human Object Extraction Method for Mobile Systems Based on Color Space Segmentation
Gen FUJITA Takaaki IMANAKA Hyunh Van NHAT Takao ONOYE Isao SHIRAKAWA

PAPER

Vol: E89-A No:4
Page(s): 941-949
Since a human object is an important element of the moving pictures processed by mobile terminals, establishing a human object extraction method encourages the dissemination of new applications. In accordance with the requirements of mobile applications, this paper proposes a low-cost human object extraction method, which consists of face object and hair object extraction based on their color information and a simple body extraction utilizing the position information of the face object. In the proposed method, skin color and hair color are estimated through color space segmentation, and a human object is effectively extracted by using a radial active contour model. Simulation results of the human object extraction on an XScale processor show that QCIF 15 fps video sequences can be processed in real time.
FOREWORD
Akira TAGUCHI Takao ONOYE

FOREWORD

Vol: E91-A No:10
Page(s): 2896-2896
SET Pulse-Width Measurement Suppressing Pulse-Width Modulation and Within-Die Process Variation Effects
Ryo HARADA Yukio MITSUYAMA Masanori HASHIMOTO Takao ONOYE

PAPER

Vol: E97-A No:7
Page(s): 1461-1467
This paper presents a measurement circuit structure for capturing SET pulse-width while suppressing pulse-width modulation and within-die process variation effects. To mitigate pulse-width modulation while maintaining area efficiency, the proposed circuit uses massively parallelized short inverter chains as a target circuit. Moreover, pulse-width calibration is performed for each inverter chain on each die. In measurements, narrow SET pulses ranging from 5 ps to 215 ps were obtained. We confirm that pulse-width may be overestimated when die-to-die and within-die variation of the measurement circuit is ignored. Our evaluation results thus point out that calibration for within-die variation, in addition to die-to-die variation, of the measurement circuit is indispensable.
